Fix missing metadata fields in label_referenced_variable_metadata operation #1247
+153
−0
Problem
The `label_referenced_variable_metadata` operation was failing with a "Column not found in data" error for `$label_referenced_variable_role` when the variables metadata didn't contain a `role` field for any variable.

This was blocking rule CG0173, which relies on the `$label_referenced_variable_role` and `$label_referenced_variable_name` metadata fields.

Root Cause
The operation creates a DataFrame from metadata records returned by `_get_variables_metadata_from_standard()`. However, when certain fields (like `role`) are missing from all variables in the metadata, those columns are not created in the DataFrame. This causes a KeyError when rules try to access the expected metadata fields.

For example, if variables metadata looks like:

    [
        {'name': 'FATESTCD', 'ordinal': 1, 'label': 'Test Code'},  # No role field
        {'name': 'FATEST', 'ordinal': 2, 'label': 'Test Name'},  # No role field
    ]

The resulting DataFrame would not have a `role` column, causing `$label_referenced_variable_role` to be missing.

Solution
Modified the `LabelReferencedVariableMetadata` operation to ensure all expected metadata fields (`name`, `role`, `ordinal`, `label`) are always present in the metadata records before creating the DataFrame. Missing fields are filled with empty strings.

Changes Made
Updated `cdisc_rules_engine/operations/label_referenced_variable_metadata.py`.
Added comprehensive test `test_get_label_referenced_variable_metadata_missing_role_field`.

Testing
The fix ensures that the `$label_referenced_variable_role` and `$label_referenced_variable_name` metadata fields are always available for rule checks, which unblocks rule CG0173.

Fixes #1202.
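
For reviewers who want to see the failure mode in isolation, the root cause and the fix can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not the code changed in this PR: `fill_missing_fields` and `columns_of` are hypothetical helpers, and `columns_of` stands in for the way a DataFrame built from a list of dicts derives its columns from the union of the keys it sees.

```python
# Assumed from the PR description: the fields every metadata record should carry.
EXPECTED_FIELDS = ("name", "role", "ordinal", "label")


def columns_of(records):
    """Union of keys in first-seen order (hypothetical stand-in for the
    columns a DataFrame built from these records would have)."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    return columns


def fill_missing_fields(records, fields=EXPECTED_FIELDS, default=""):
    """Return copies of the records with every expected field present.

    Hypothetical helper: missing fields get an empty string, existing
    values are kept, and the input records are not modified.
    """
    return [{**{f: default for f in fields}, **record} for record in records]


variables_metadata = [
    {"name": "FATESTCD", "ordinal": 1, "label": "Test Code"},  # no role field
    {"name": "FATEST", "ordinal": 2, "label": "Test Name"},  # no role field
]

# Before: no record has a role key, so no role column would be created.
print("role" in columns_of(variables_metadata))

# After: every record carries all expected fields, so the column exists.
normalized = fill_missing_fields(variables_metadata)
print(columns_of(normalized))
print([record["role"] for record in normalized])
```

Filling the records before the DataFrame is built, rather than adding columns afterwards, keeps the sketch to a single pass over the records and leaves values that are already present untouched.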